A Fuzzy Style Clustering Algorithm on Stylistic Data
SHEN Hao1,2, WANG Shitong1,2
1. School of Digital Media, Jiangnan University, Wuxi 214122; 2. Jiangsu Key Laboratory of Media Design and Software Technology, Jiangnan University, Wuxi 214122
Abstract: In stylistic data, the samples in different clusters follow different organizational styles, which may be either explicit or implicit. Classical partitional clustering methods, represented by K-means and fuzzy C-means, do not model these styles and are therefore ineffective on stylistic data. To address this, a fuzzy style clustering (FSC) algorithm is proposed. A style normalization matrix represents the style information of the samples within each cluster, and the distance matrix is computed from the samples after they are transformed by the style normalization matrices. In addition, fuzzy membership describes the degree to which each sample belongs to each cluster. The membership matrix and the style normalization matrices are optimized jointly by the commonly used alternating optimization technique. FSC thus exploits both the style information of the samples and the relationship between samples and clusters. Experimental results on synthetic and real datasets indicate the effectiveness of the proposed algorithm.
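The two ingredients named in the abstract, a distance computed on style-transformed samples and a fuzzy membership update, can be illustrated with a minimal sketch. It is not the authors' formulation: the form of the distance, `d[i, k] = ||A_k (x_i - v_k)||^2`, and all function and variable names are assumptions made for illustration. The membership and center updates are the standard fuzzy C-means ones, and the update rule for the style normalization matrices is deliberately left out because the abstract does not state it.

```python
import numpy as np

def transformed_sq_distances(X, centers, transforms):
    """Assumed distance: d[i, k] = ||A_k (x_i - v_k)||^2.

    X: (n, p) samples, centers: (c, p), transforms: (c, p, p).
    Each A_k stands in for a per-cluster style normalization matrix.
    """
    n, c = X.shape[0], centers.shape[0]
    d = np.empty((n, c))
    for k in range(c):
        diff = (X - centers[k]) @ transforms[k].T  # row i is A_k (x_i - v_k)
        d[:, k] = np.sum(diff ** 2, axis=1)
    return d

def fcm_memberships(d, m=2.0, eps=1e-12):
    """Standard fuzzy C-means membership update from squared distances."""
    d = np.maximum(d, eps)                 # avoid division by zero
    inv = d ** (-1.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)

def fcm_centers(X, U, m=2.0):
    """Standard fuzzy C-means center update (membership-weighted means).

    This remains the minimizer under the assumed distance as long as
    every A_k is invertible.
    """
    W = U ** m                             # (n, c)
    return (W.T @ X) / W.sum(axis=0)[:, None]

def alternate(X, centers, transforms, m=2.0, n_iter=20):
    """Alternate membership and center updates with the transforms held fixed."""
    for _ in range(n_iter):
        U = fcm_memberships(transformed_sq_distances(X, centers, transforms), m)
        centers = fcm_centers(X, U, m)
    return U, centers
```

With every `A_k` set to the identity matrix the sketch reduces to ordinary fuzzy C-means; a full style-aware method would add a third step inside the loop that re-estimates each `A_k` from the current memberships.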